Processing in Hyperspectral Images (HSIs)
Anna Androvitsanea
anna.androvitsanea@gmail.com
Table of contents
# import libraries
import scipy.io as sio
import pandas as pd
import numpy as np
import scipy.optimize
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d import Axes3D
from scipy.linalg import norm
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
import math
import mpl_toolkits.mplot3d.art3d as art3d
import matplotlib.colors as clr
from numpy import linalg as LA
from scipy.optimize import nnls
from cvxopt import matrix, solvers
from scipy.spatial import distance
from sklearn.metrics import confusion_matrix
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
from matplotlib.pyplot import cm
from mpl_toolkits.axes_grid1 import make_axes_locatable
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.linear_model import MultiTaskLasso
from sklearn.linear_model import MultiTaskLassoCV
from sklearn.linear_model import LassoCV
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_score, recall_score
from sklearn.metrics import f1_score
from sklearn.metrics import classification_report
# import data
Pavia = sio.loadmat('PaviaU_cube.mat')
HSI = Pavia['X'] #Pavia HSI : 300x200x103
ends = sio.loadmat('PaviaU_endmembers.mat') # Endmember's matrix: 103x9
endmembers = ends['endmembers']
# import ground truth
ground_truth = sio.loadmat('PaviaU_ground_truth.mat')
labels = ground_truth['y']
# plot spectral signatures for 9 endmembers
fig = plt.figure(figsize=(10,4))
plt.plot(endmembers)
plt.ylabel('Radiance values')
plt.xlabel('Spectral bands')
plt.title('9 Endmembers spectral signatures at Pavia University HSI')
plt.show()
# make a dict mapping the 9 endmembers to the corresponding material name
materials = {0: 'unknown',
1: 'Water',
2: 'Trees',
3: 'Asphalt',
4: 'Bricks',
5: 'Bitumen',
6: 'Tiles',
7: 'Shadows',
8: 'Meadows',
9: 'Bare Soil'}
# Plot the spectral signature for each endmember
colors = ['royalblue', 'green', 'dimgray', 'firebrick', 'silver',
          'cadetblue', 'black', 'lawngreen', 'chocolate']
for i in range(0, 9):
    fig = plt.figure()
    plt.plot(endmembers[:, i], c=colors[i])  # i-th column = i-th endmember
    plt.ylabel('Radiance values')
    plt.xlabel('Spectral bands')
    plt.title('Spectral signature of endmember %d (%s) of Pavia University HSI' % (i+1, materials[i+1]))
    plt.show()
# plot labels, including zero values
plt.figure(figsize=(20,10)) # set fig size
# make a dict to assign a distinct color to each material/surface
color_dict= {0: 'oldlace', 1: 'royalblue', 2: 'green', 3: 'dimgray', 4: 'firebrick', 5: 'silver',
6: 'cadetblue', 7: 'black', 8: 'lawngreen', 9: 'chocolate'}
cmaps_shuffle = {1:'Blues',2:'Greens',3:'Greys', 4:'Oranges', 5:'gist_yarg',
6:'Purples',7:'Greys',8:'Greens',9:'YlOrBr'}
# make a color map from the dict
cmap = ListedColormap([color_dict[x] for x in color_dict.keys()])
# make a legend based on the color and material
patches = [mpatches.Patch(color=color_dict[i],label = materials[i]) for i in range(0,10)]
plt.legend(handles=patches, bbox_to_anchor = (1.3, 1), loc=2, borderaxespad=0. )
plt.imshow(labels, cmap = cmap) # plot the array
plt.title('Visualization of the ground truth for the land cover\nof Pavia University HSI')
plt.colorbar() # plot the color bar
plt.savefig('ground_truth_raw.png') # save figure
# plot labels without the zero values
X = np.ma.masked_equal(labels, 0) # mask zero values
plt.figure(figsize=(20,10)) # set fig size
# make a dict to assign a distinct color to each material/surface
color_dict_masked = {1: 'royalblue', 2: 'green', 3: 'dimgray', 4: 'firebrick', 5: 'silver',
6: 'cadetblue', 7: 'black', 8: 'lawngreen', 9: 'chocolate'}
materials_masked = {1: 'Water',
2: 'Trees',
3: 'Asphalt',
4: 'Bricks',
5: 'Bitumen',
6: 'Tiles',
7: 'Shadows',
8: 'Meadows',
9: 'Bare Soil'}
# make a color map from the dict
cmap_masked = ListedColormap([color_dict_masked[x] for x in color_dict_masked.keys()])
# make a legend based on the color and material
patches_masked =[mpatches.Patch(color=color_dict_masked[i],label=materials_masked[i]) for i in range(1,10)]
plt.legend(handles=patches_masked, bbox_to_anchor=(1.3, 1), loc=2, borderaxespad=0. )
plt.imshow(X, cmap = cmap_masked) # plot the array
plt.title('Visualization of the ground truth with masked zero values \nfor the land cover of Pavia University HSI')
plt.colorbar() # plot the color bar
plt.savefig('ground_truth_raw_masked.png') # save figure
# plot data
plt.figure(figsize=(20,10))
plt.imshow(HSI[:,:,10], cmap = 'inferno')
plt.title('Visualization of spectral band 10 of Pavia University HSI')
plt.show()
In this part I consider the set consisting of the 9 endmembers, where each endmember has a specific spectral signature, which corresponds to the pure pixels in the HSI dataset.
For a given pixel in the image, the aim is to determine the percentage (abundance) with which each pure material contributes to its formation, and thus unmix the pixel.
I fit linear regression models that connect the 9 endmembers with the mixed pixels of the HSI dataset. I consider the following models:
(a) Least squares (no constraints)
(b) Least squares imposing the sum-to-one constraint on the $\theta$s
(c) Least squares imposing the non-negativity constraint on the $\theta$s
(d) Least squares imposing both the non-negativity and the sum-to-one constraint on the $\theta$s
(e) LASSO, imposing sparsity on the $\theta$s via $l_1$ norm minimization
First, I calculate the abundance maps for each material, i.e. 9 maps.
Then I compute the reconstruction error as follows:
I calculate the reconstruction error (for each pixel with a non-zero class label) using the formula: $$error_i = \frac{||\mathbf{y}_i - \mathbf{X}\boldsymbol{\theta}_i||^2}{||\mathbf{y}_i||^2}$$ Then, over the $N$ such pixels, I compute the average value: $$\text{reconstruction error} = \frac{1}{N}\sum_{i=1}^{N} error_i$$
Finally, I compare the results obtained from the above five methods based on the abundance maps and the reconstruction error.
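As a toy illustration of the per-pixel relative error above (on a made-up 5-band signature with 3 endmembers, not the Pavia data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((5, 3))             # hypothetical endmember matrix: 5 bands x 3 endmembers
theta = np.array([0.2, 0.5, 0.3])  # hypothetical abundances
y = X @ theta                      # a perfectly mixed pixel

# per-pixel relative reconstruction error: ||y - X theta||^2 / ||y||^2
error = np.linalg.norm(y - X @ theta)**2 / np.linalg.norm(y)**2
print(error)  # 0.0 for an exactly mixed pixel
```

A pixel that is an exact linear mixture of the endmembers has zero error; real pixels contribute small positive terms to the average.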
# Linear regression without constraints
# I perform the regression for those pixels
# that have non zero label
# at the ground truth matrix
# Those with zero label will get a zero array
# I want to ensure the integrity
# of the rows x columns of the image
XTXinv = np.linalg.inv(endmembers.T @ endmembers) # calculate the endmembers.T * endmembers
thetas = []
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # perform the regression only for non-zero labels
            y = np.reshape(HSI[row, column, :], (103, 1))  # mixed signature of the pixel
            theta = XTXinv @ endmembers.T @ y  # least-squares estimate of theta
        else:
            theta = np.reshape(np.zeros(9), (9, 1))  # zero-labelled pixels get a 9x1 zero vector
        thetas.append(theta)
thetas_ar = np.array(thetas)  # transform to array
thetas_ar_resh = np.reshape(thetas_ar, (300, 200, 9))
thetas_ar_resh.shape
(300, 200, 9)
# unmixing the pixel
y_unmixed_ols = thetas_ar_resh @ endmembers.T # calculate unmixed pixel with reshaped array
y_unmixed_ols.shape
(300, 200, 103)
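The closed-form normal-equations estimate used in the loop above can be sanity-checked against numpy's least-squares solver; a small sketch with a made-up, well-conditioned system (not the Pavia matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((103, 9))   # stand-in for the endmember matrix
y = rng.random((103, 1))   # stand-in for one mixed pixel

# closed-form normal equations, as in the loop above
theta_normal = np.linalg.inv(X.T @ X) @ X.T @ y
# numpy's least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(theta_normal, theta_lstsq))  # True
```

For ill-conditioned endmember matrices, `lstsq` (based on the SVD) is numerically safer than explicitly inverting $X^TX$.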
# plot thetas for all endmembers vs the masked ground truth
for endmember in range(0,9):
fig, (ax1, ax2) = plt.subplots(1, 2)
divider1 = make_axes_locatable(ax1)
divider2 = make_axes_locatable(ax2)
cax1 = divider1.append_axes('right', size='5%', pad=0.05)
cax2 = divider2.append_axes('right', size='5%', pad=0.05)
im1 = ax1.imshow(thetas_ar_resh[:,:,endmember],
cmap = cmaps_shuffle[endmember + 1])
im2 = ax2.imshow(np.ma.masked_where(labels != (endmember + 1),
labels),
cmap = ListedColormap([color_dict[endmember + 1]]))
fig.colorbar(im1, cax=cax1, orientation='vertical')
fig.colorbar(im2, cax=cax2, orientation='vertical')
plt.subplots_adjust(right=1.0)
ax1.set_title(r'Abundance map: $\theta_{%d}$: Endmember %d' %
(endmember+1, endmember+1))
ax2.set_title('Ground truth')
ax2.legend(handles=patches_masked,
bbox_to_anchor=(1.3, 1),
loc=2, borderaxespad=0.)
fig.suptitle(r'Linear regression: $\theta_{%d}$ value vs ground truth' %
(endmember+1))
# calculate the reconstruction error
reco_error = 0
N = 0
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # only pixels with a non-zero label
            y = np.reshape(HSI[row, column, :], (103, 1))  # mixed signature of the pixel
            theta_reshaped = np.reshape(thetas_ar_resh[row, column, :], (9, 1))  # reshape the thetas
            error_init = (LA.norm(y - endmembers @ theta_reshaped)**2) / (LA.norm(y)**2)  # per-pixel error
            N += 1  # count the evaluated pixels
            reco_error += error_init  # sum of errors
reconstruction_error_ls = reco_error / N  # average over all evaluated pixels
print('The reconstruction error is:', reconstruction_error_ls)
The reconstruction error is: 0.0014528012621974951
I apply the cvxopt package in order to introduce constraints into the linear regression. Quadratic programs can be solved via the solvers.qp() function, which minimizes $\frac{1}{2}\mathbf{\theta}^T P \mathbf{\theta} + q^T\mathbf{\theta}$ subject to linear constraints.
In this case I want to minimize:
\begin{array}{ll} ||\mathbf{y} - \mathbf{X}\mathbf{\theta}||^2 = (\mathbf{y} - \mathbf{X}\mathbf{\theta})^T(\mathbf{y} - \mathbf{X}\mathbf{\theta}) = \mathbf{y}^T\mathbf{y} - 2\mathbf{y}^T\mathbf{X}\mathbf{\theta} + \mathbf{\theta}^T\mathbf{X}^T\mathbf{X}\mathbf{\theta} \\ \mbox{I ignore the term } \mathbf{y}^T\mathbf{y} \mbox{ since it does not depend on } \mathbf{\theta}: \boxed{\mathbf{\theta}^T\mathbf{X}^T\mathbf{X}\mathbf{\theta} - 2\mathbf{y}^T\mathbf{X}\mathbf{\theta}} \\ \mbox{so that } P = 2\mathbf{X}^T\mathbf{X} \mbox{ and } q^T = -2\mathbf{y}^T\mathbf{X} \end{array}
thetas_sum_one = []
A = matrix(np.ones((1,9)))
b = matrix(np.array([1.]))
P = matrix(2 * endmembers.T @ endmembers)
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # solve the QP only for non-zero labels
            q = -2 * matrix(endmembers.T @ HSI[row, column, :])
            sol = solvers.qp(P, q, A=A, b=b, options={'show_progress': False})
            theta = np.array(sol['x'])
        else:
            theta = np.reshape(np.zeros(9), (9, 1))  # zero-labelled pixels get a 9x1 zero vector
        thetas_sum_one.append(theta)
thetas_sum_one_ar = np.array(thetas_sum_one)  # transform to array
thetas_sum_one_ar_resh = np.reshape(thetas_sum_one_ar, (300, 200, 9))
thetas_sum_one_ar_resh.shape
thetas_sum_one_ar_resh.shape
(300, 200, 9)
# unmixing the pixel
y_unmixed_sum_one = thetas_sum_one_ar_resh @ endmembers.T # calculate unmixed pixel with reshaped array
y_unmixed_sum_one.shape
(300, 200, 103)
# plot thetas for all endmembers vs the masked ground truth
for endmember in range(0,9):
fig, (ax1, ax2) = plt.subplots(1, 2)
divider1 = make_axes_locatable(ax1)
divider2 = make_axes_locatable(ax2)
cax1 = divider1.append_axes('right', size='5%', pad=0.05)
cax2 = divider2.append_axes('right', size='5%', pad=0.05)
im1 = ax1.imshow(thetas_sum_one_ar_resh[:,:,endmember],
cmap = cmaps_shuffle[endmember + 1])
im2 = ax2.imshow(np.ma.masked_where(labels != (endmember + 1),
labels),
cmap = ListedColormap([color_dict[endmember + 1]]))
fig.colorbar(im1, cax=cax1, orientation='vertical')
fig.colorbar(im2, cax=cax2, orientation='vertical')
plt.subplots_adjust(right=1.0)
ax1.set_title(r'Abundance map: $\theta_{%d}$: Endmember %d' %
(endmember+1, endmember+1))
ax2.set_title('Ground truth')
ax2.legend(handles=patches_masked,
bbox_to_anchor=(1.3, 1),
loc=2, borderaxespad=0.)
fig.suptitle(r'Linear regression under sum-to-one constraint: $\theta_{%d}$ value vs ground truth' %
(endmember+1))
# calculate the reconstruction error
reco_error = 0
N = 0
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # only pixels with a non-zero label
            y = np.reshape(HSI[row, column, :], (103, 1))  # mixed signature of the pixel
            theta_reshaped = np.reshape(thetas_sum_one_ar_resh[row, column, :], (9, 1))  # reshape the thetas
            error_init = (LA.norm(y - endmembers @ theta_reshaped)**2) / (LA.norm(y)**2)  # per-pixel error
            N += 1  # count the evaluated pixels
            reco_error += error_init  # sum of errors
reconstruction_error_sum_one = reco_error / N  # average over all evaluated pixels
print('The reconstruction error is:', reconstruction_error_sum_one)
The reconstruction error is: 0.001965098636016303
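The P and q derived above can be sanity-checked without cvxopt: for the unconstrained problem, the stationary point of $\frac{1}{2}\theta^T P\theta + q^T\theta$ must coincide with the least-squares solution. A small numpy check on made-up data:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((20, 5))
y = rng.random(20)

P = 2 * X.T @ X   # quadratic term, as derived above
q = -2 * X.T @ y  # linear term (q = -2 X^T y, i.e. q^T = -2 y^T X)

# stationary point of (1/2) theta^T P theta + q^T theta: P theta = -q
theta_qp = np.linalg.solve(P, -q)
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(theta_qp, theta_ls))  # True
```

With the constraints active, solvers.qp() of course returns a different (constrained) minimizer, but P and q stay the same.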
Quadratic programs can be solved via the solvers.qp() function, as in the sum-to-one case above.
thetas_non_neg = []
P = matrix(2 * endmembers.T @ endmembers)
G = - matrix(np.eye(9))
h = matrix(np.zeros(9))
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # solve the QP only for non-zero labels
            q = -2 * matrix(endmembers.T @ HSI[row, column, :])
            sol = solvers.qp(P, q, G=G, h=h, options={'show_progress': False})
            theta = np.array(sol['x'])
        else:
            theta = np.reshape(np.zeros(9), (9, 1))  # zero-labelled pixels get a 9x1 zero vector
        thetas_non_neg.append(theta)
thetas_non_neg_ar = np.array(thetas_non_neg)  # transform to array
thetas_non_neg_ar_resh = np.reshape(thetas_non_neg_ar, (300, 200, 9))
thetas_non_neg_ar_resh.shape
thetas_non_neg_ar_resh.shape
(300, 200, 9)
# unmixing the pixel
y_unmixed_non_neg = thetas_non_neg_ar_resh @ endmembers.T # calculate unmixed pixel with reshaped array
y_unmixed_non_neg.shape
(300, 200, 103)
for endmember in range(0,9):
fig, (ax1, ax2) = plt.subplots(1, 2)
divider1 = make_axes_locatable(ax1)
divider2 = make_axes_locatable(ax2)
cax1 = divider1.append_axes('right', size='5%', pad=0.05)
cax2 = divider2.append_axes('right', size='5%', pad=0.05)
im1 = ax1.imshow(thetas_non_neg_ar_resh[:,:,endmember],
cmap = cmaps_shuffle[endmember + 1])
im2 = ax2.imshow(np.ma.masked_where(labels != (endmember + 1),
labels),
cmap = ListedColormap([color_dict[endmember + 1]]))
fig.colorbar(im1, cax=cax1, orientation='vertical')
fig.colorbar(im2, cax=cax2, orientation='vertical')
plt.subplots_adjust(right=1.0)
ax1.set_title(r'Abundance map: $\theta_{%d}$: Endmember %d' %
(endmember+1, endmember+1))
ax2.set_title('Ground truth')
ax2.legend(handles=patches_masked,
bbox_to_anchor=(1.3, 1),
loc=2, borderaxespad=0.)
fig.suptitle(r'Linear regression under non-negativity constraint: $\theta_{%d}$ value vs ground truth' %
(endmember+1))
# calculate the reconstruction error
reco_error = 0
N = 0
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # only pixels with a non-zero label
            y = np.reshape(HSI[row, column, :], (103, 1))  # mixed signature of the pixel
            theta_reshaped = np.reshape(thetas_non_neg_ar_resh[row, column, :], (9, 1))  # reshape the thetas
            error_init = (LA.norm(y - endmembers @ theta_reshaped)**2) / (LA.norm(y)**2)  # per-pixel error
            N += 1  # count the evaluated pixels
            reco_error += error_init  # sum of errors
reconstruction_error_non_neg = reco_error / N  # average over all evaluated pixels
print('The reconstruction error is:', reconstruction_error_non_neg)
The reconstruction error is: 0.004186132012950644
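As a cross-check for the non-negativity constraint, scipy's nnls (already imported above) solves the same per-pixel problem without a general QP solver; a sketch on made-up data:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
X = rng.random((103, 9))  # stand-in for the endmember matrix
y = rng.random(103)       # stand-in for one mixed pixel

# argmin ||y - X theta||_2  subject to  theta >= 0
theta_nn, residual = nnls(X, y)
print((theta_nn >= 0).all())  # True
```

nnls typically zeroes out several coefficients, which is why the non-negativity constraint alone already produces fairly clean abundance maps.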
Quadratic programs can be solved via the solvers.qp() function, as in the sum-to-one case above.
thetas_sum_one_non_neg = []
A = matrix(np.ones((1,9)))
b = matrix(np.array([1.]))
P = matrix(2 * endmembers.T @ endmembers)
G = -matrix(np.eye(9))
h = matrix(np.zeros(9))
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # solve the QP only for non-zero labels
            q = -2 * matrix(endmembers.T @ HSI[row, column, :])
            sol = solvers.qp(P, q, A=A, b=b, G=G, h=h, options={'show_progress': False})
            theta = np.array(sol['x'])
        else:
            theta = np.reshape(np.zeros(9), (9, 1))  # zero-labelled pixels get a 9x1 zero vector
        thetas_sum_one_non_neg.append(theta)
thetas_sum_one_non_neg_ar = np.array(thetas_sum_one_non_neg)  # transform to array
thetas_sum_one_non_neg_ar_resh = np.reshape(thetas_sum_one_non_neg_ar, (300, 200, 9))
thetas_sum_one_non_neg_ar_resh.shape
thetas_sum_one_non_neg_ar_resh.shape
(300, 200, 9)
# unmixing the pixel
# calculate unmixed pixel with reshaped array
y_unmixed_sum_one_non_neg = thetas_sum_one_non_neg_ar_resh @ endmembers.T
y_unmixed_sum_one_non_neg.shape
(300, 200, 103)
# plot thetas for all endmembers vs the masked ground truth
for endmember in range(0,9):
fig, (ax1, ax2) = plt.subplots(1, 2)
divider1 = make_axes_locatable(ax1)
divider2 = make_axes_locatable(ax2)
cax1 = divider1.append_axes('right', size='5%', pad=0.05)
cax2 = divider2.append_axes('right', size='5%', pad=0.05)
im1 = ax1.imshow(thetas_sum_one_non_neg_ar_resh[:,:,endmember],
cmap = cmaps_shuffle[endmember + 1])
im2 = ax2.imshow(np.ma.masked_where(labels != (endmember + 1),
labels),
cmap = ListedColormap([color_dict[endmember + 1]]))
fig.colorbar(im1, cax=cax1, orientation='vertical')
fig.colorbar(im2, cax=cax2, orientation='vertical')
plt.subplots_adjust(right=1.0)
ax1.set_title(r'Abundance map: $\theta_{%d}$: Endmember %d' %
(endmember+1, endmember+1))
ax2.set_title('Ground truth')
ax2.legend(handles=patches_masked,
bbox_to_anchor=(1.3, 1),
loc=2, borderaxespad=0.)
fig.suptitle(r'Linear regression under non-negativity and the sum-to-one constraint: $\theta_{%d}$ value vs ground truth' %
(endmember+1))
# calculate the reconstruction error
reco_error = 0
N = 0
for row in range(0, 300):
    for column in range(0, 200):
        if labels[row][column] != 0:  # only pixels with a non-zero label
            y = np.reshape(HSI[row, column, :], (103, 1))  # mixed signature of the pixel
            theta_reshaped = np.reshape(thetas_sum_one_non_neg_ar_resh[row, column, :], (9, 1))  # reshape the thetas
            error_init = (LA.norm(y - endmembers @ theta_reshaped)**2) / (LA.norm(y)**2)  # per-pixel error
            N += 1  # count the evaluated pixels
            reco_error += error_init  # sum of errors
reconstruction_error_non_neg_sum_one = reco_error / N  # average over all evaluated pixels
print('The reconstruction error is:', reconstruction_error_non_neg_sum_one)
The reconstruction error is: 0.01255289820688441
I want to minimize the quantity $\frac{1}{2 n_{samples}} ||Y - XW||^2_{Fro} + \alpha ||W||_{21}$, imposing sparsity on the $\theta$s with the $l_1/l_2$ mixed norm as regularizer.
$||W||_{21} = \sum_i \sqrt{\sum_j w_{ij}^2}$, i.e. the sum of the Euclidean norms of the rows.
I use the class sklearn.linear_model.MultiTaskLasso(), which fits the Lasso linear model with this mixed-norm regularizer by coordinate descent.
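As a quick numeric illustration of the $||W||_{21}$ norm (on a made-up 2x2 matrix, unrelated to the Pavia data):

```python
import numpy as np

W = np.array([[3.0, 4.0],   # row norm sqrt(9 + 16) = 5
              [0.0, 0.0]])  # row norm 0 (an entirely zeroed-out row)

# ||W||_21 = sum over rows of the Euclidean norm of each row
l21 = np.sum(np.sqrt(np.sum(W**2, axis=1)))
print(l21)  # 5.0
```

Because the penalty acts on whole rows, MultiTaskLasso tends to zero out entire rows of coefficients, i.e. whole endmembers at once for a pixel.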
# set the model and fit the non zero HSI values
no_zero = np.where(labels!=0)
HSI_no_zero = HSI[no_zero[0],no_zero[1],:].T
lasso_model = MultiTaskLasso(alpha=1e+6, tol=0.01,
                             max_iter=10000, warm_start=True,
                             fit_intercept=False)
fitted_lasso = lasso_model.fit(endmembers, HSI_no_zero)
fitted_lasso.coef_.shape
(12829, 9)
# transform the array of thetas from LASSO
# to a 300 x 200 x 9 array (image)
lasso_map = np.zeros((300, 200, 9))
for index in range(len(no_zero[0])):
    row, col = no_zero[0][index], no_zero[1][index]
    for endmember in range(0, 9):
        lasso_map[row, col, endmember] = fitted_lasso.coef_[index, endmember]
lasso_abund_map = np.ma.masked_equal(lasso_map, 0)  # mask zero values
# unmixing the pixel
unmixed_lasso = endmembers @ fitted_lasso.coef_.T
unmixed_lasso.T.shape
# arrange the reconstructed spectra in a 300 x 200 x 103 array
unmixed_lasso_array = np.zeros((300, 200, 103))
for index in range(len(no_zero[0])):
    row, col = no_zero[0][index], no_zero[1][index]
    unmixed_lasso_array[row, col, :] = unmixed_lasso.T[index, :]
# plot thetas for all endmembers vs the masked ground truth
for endmember in range(0,9):
fig, (ax1, ax2) = plt.subplots(1, 2)
divider1 = make_axes_locatable(ax1)
divider2 = make_axes_locatable(ax2)
cax1 = divider1.append_axes('right', size='5%', pad=0.05)
cax2 = divider2.append_axes('right', size='5%', pad=0.05)
im1 = ax1.imshow(lasso_abund_map[:,:,endmember],
cmap = cmaps_shuffle[endmember + 1])
im2 = ax2.imshow(np.ma.masked_where(labels != (endmember + 1),
labels),
cmap = ListedColormap([color_dict[endmember + 1]]))
fig.colorbar(im1, cax=cax1, orientation='vertical')
fig.colorbar(im2, cax=cax2, orientation='vertical')
plt.subplots_adjust(right=1.0)
ax1.set_title(r'Abundance map: $\theta_{%d}$: Endmember %d' %
(endmember+1, endmember+1))
ax2.set_title('Ground truth')
ax2.legend(handles=patches_masked,
bbox_to_anchor=(1.3, 1),
loc=2, borderaxespad=0.)
fig.suptitle(r'Linear regression under LASSO constraint: $\theta_{%d}$ value vs ground truth' %
(endmember+1))
# find the reconstruction error for lasso
def errors_lasso(matrix, no_zero):
    error = np.zeros((300, 200))
    unmixed = endmembers @ matrix  # reconstructed spectra: 103 x N
    for i in range(len(no_zero[0])):
        row, col = no_zero[0][i], no_zero[1][i]
        y_min_x_theta = np.linalg.norm(HSI[row, col, :] - unmixed[:, i])**2
        y_sq = np.linalg.norm(HSI[row, col, :])**2
        error[row, col] = y_min_x_theta / y_sq
    return error
lasso_errors = errors_lasso(fitted_lasso.coef_.T, np.where(labels != 0))
reco_lasso_error = lasso_errors.sum() / len(np.where(labels!=0)[0])
print('The reconstruction error is:', reco_lasso_error)
The reconstruction error is: 0.01005315762992714
# Least squares errors
print('Least squares error = {:>30}' .format(round(reconstruction_error_ls,4)))
print('Least squares sum-to-one error = {:>18}'.format(round(reconstruction_error_sum_one,4)))
print('Least squares non-negative error = {:>17}'.format(round(reconstruction_error_non_neg,4)))
print('Least squares sum-to-one non-negative error = {:>}'.format(round(reconstruction_error_non_neg_sum_one,4)))
print('Least squares LASSO error = {:>24}' .format(round(reco_lasso_error,4)))
Least squares error = 0.0015
Least squares sum-to-one error = 0.002
Least squares non-negative error = 0.0042
Least squares sum-to-one non-negative error = 0.0126
Least squares LASSO error = 0.0101
Among the regressors examined, the minimum least squares error is achieved by the plain OLS regressor and the maximum by OLS with both the sum-to-one and non-negativity constraints on the $\theta_i$ values.
However, I also want to see how the regressors perform spatially, i.e. how well they reconstruct the map of the campus. To this end, I examine each material separately.
# set a function to plot each label in order to compare on a
# one-by-one basis
def function_to_plot(endmember):
    fig, axes = plt.subplots(figsize=(30, 20), nrows=2, ncols=3)
    abund_maps = [thetas_ar_resh, thetas_sum_one_ar_resh, thetas_non_neg_ar_resh,
                  thetas_sum_one_non_neg_ar_resh, lasso_abund_map]
    titles = ['Linear Reg.: No constraint', 'Sum-to-one', 'Non-negativity',
              'Sum-to-one\n non-negativity', 'LASSO']
    # one abundance map per regressor, each with its own colorbar
    for ax, abund, title in zip(axes.flat, abund_maps, titles):
        cax = make_axes_locatable(ax).append_axes('right', size='5%', pad=0.05)
        im = ax.imshow(abund[:, :, endmember], cmap=cmaps_shuffle[endmember + 1])
        fig.colorbar(im, cax=cax, orientation='vertical')
        ax.set_title(title, fontsize=25)
    # ground truth for the selected endmember in the last panel
    ax_gt = axes.flat[5]
    cax_gt = make_axes_locatable(ax_gt).append_axes('right', size='5%', pad=0.05)
    im_gt = ax_gt.imshow(np.ma.masked_where(labels != (endmember + 1), labels),
                         cmap=ListedColormap([color_dict[endmember + 1]]))
    fig.colorbar(im_gt, cax=cax_gt, orientation='vertical')
    ax_gt.set_title('Ground truth for endmember %d' % (endmember + 1), fontsize=25)
    fig.suptitle(r'Abundance maps of the linear regressors: $\theta_{%d}$ value vs ground truth for %s' %
                 ((endmember + 1), materials_masked[endmember + 1]), fontsize=30)
function_to_plot(0)
Comparing the ground truth for water against the abundance maps of the regressors I notice the following:
function_to_plot(1)
Comparing the ground truth for trees against the abundance maps of the regressors I notice the following:
function_to_plot(2)
Comparing the ground truth for asphalt against the abundance maps of the regressors I notice the following:
None of the five regression models produce a satisfactory result when identifying the roads (asphalt).
function_to_plot(3)
Comparing the ground truth for bricks against the abundance maps of the regressors I notice the following:
None of the five regression models produce a satisfactory result when identifying the bricks.
function_to_plot(4)
Comparing the ground truth for bitumen against the abundance maps of the regressors I notice the following:
The non-negativity constraint regressor succeeds almost perfectly in identifying the bitumen at the campus.
The simple OLS and sum-to-one constraint regressors are close behind; they also identify the bitumen, while incorrectly assigning a similar $\theta$ value to some other landscape features.
The sum-to-one non-negativity constraint regressor identifies the bitumen but incorrectly assigns a similar $\theta$ value to a lot of other landscape features.
Finally, the LASSO regressor scores very poorly. It identifies the bitumen but incorrectly assigns a similar $\theta$ value to many other landscape features.
function_to_plot(5)
Comparing the ground truth for tiles against the abundance maps of the regressors I notice the following:
The non-negativity constraint regressor succeeds almost perfectly in identifying the tiles at the campus.
The second best performance is that of the sum-to-one non-negativity constraint regressor which also identifies the tiles, while incorrectly assigning a similar $\theta$ value to the trees.
The simple OLS and the sum-to-one constraint regressors correctly identify the tiles but incorrectly assign a similar $\theta$ value to a lot of other landscape features.
Finally, the LASSO regression is the worst of the pack, identifying the tiles but also incorrectly assigning a similar $\theta$ value to many other landscape features.
function_to_plot(6)
Comparing the ground truth for shadows against the abundance maps of the regressors I notice the following:
The sum-to-one non-negativity constraint regressor succeeds best in identifying the shadows at the campus. However, it assigns a similar $\theta$ value to the torrent as well.
The second best performance is that of the non-negativity constraint regressor, which also identifies the shadows but incorrectly assigns a similar $\theta$ value to the torrent and the meadows.
The LASSO regression correctly identifies the shadows but incorrectly assigns a similar $\theta$ value to a lot of other landscape features.
Finally, the simple OLS and the sum-to-one constraint regressors come in last, assigning similar $\theta$ values to most of the campus.
function_to_plot(7)
Comparing the ground truth for meadows against the abundance maps of the regressors I notice the following:
The non-negativity constraint regressor succeeds best in identifying the meadows at the campus. However, it assigns a similar $\theta$ value to the torrent and some asphalt features as well.
The second best performance is that of the sum-to-one non-negativity constraint regressor, which identifies the meadows but incorrectly assigns a similar $\theta$ value to the torrent and some other features.
The LASSO regression correctly identifies the meadows but incorrectly assigns a similar $\theta$ value to a lot of other landscape features.
Finally, the simple OLS and the sum-to-one constraint regressors incorrectly assign a similar $\theta$ value to most of the campus.
function_to_plot(8)
Comparing the ground truth for bare soil against the abundance maps of the regressors I notice the following:
The non-negativity constraint regressor succeeds best in representing the bare soil at the campus, while incorrectly assigning a similar $\theta$ value to the torrent and the bitumen.
The second best performance is that of the sum-to-one non-negativity constraint regressor, which also represents the bare soil. However, it assigns a similar $\theta$ value to the torrent, the bitumen, some tiles and some other features as well.
The simple OLS and the sum-to-one constraint regressors correctly identify the bare soil but incorrectly assign a similar $\theta$ value to a lot of other features.
The LASSO regression comes in last, as it does not identify the bare soil at all.
Summing up, I get the following results by comparing the abundance map of each material against the ground truth:
| Material | best fit | second best fit |
|---|---|---|
| Water | non-negativity constraint | sum-to-one - non-negativity-constraint |
| Trees | non-negativity constraint | sum-to-one - non-negativity-constraint |
| Asphalt | non-negativity constraint | sum-to-one - non-negativity-constraint |
| Bricks | non-negativity constraint | sum-to-one - non-negativity-constraint |
| Bitumen | non-negativity constraint | simple OLS |
| Tiles | non-negativity constraint | sum-to-one - non-negativity-constraint |
| Shadows | sum-to-one - non-negativity-constraint | non-negativity constraint |
| Meadows | non-negativity constraint | sum-to-one - non-negativity-constraint |
| Bare Soil | non-negativity constraint | sum-to-one - non-negativity-constraint |
Additionally, in the reconstruction error section above, I calculate the following values:
| Method | Error |
|---|---|
| Least squares error | 0.0015 |
| Least squares sum-to-one error | 0.002 |
| Least squares non-negative error | 0.0042 |
| Least squares sum-to-one non-negative error | 0.0126 |
| Least squares LASSO error | 0.0101 |
The least squares non-negativity-constraint regression error lies in the middle of the range of errors.
Since this regressor nevertheless performs best spatially compared to the others, it comes out on top in this case study.
# Training set for classification
# import data
Pavia_labels = sio.loadmat('classification_labels_Pavia.mat')
# prepare test set
Test_Set = (np.reshape(Pavia_labels['test_set'],(200,300))).T
# prepare operational set
Operational_Set = (np.reshape(Pavia_labels['operational_set'],(200,300))).T
# prepare training set
Training_Set = (np.reshape(Pavia_labels['training_set'],(200,300))).T
# compute the indexes in order to exclude zero values
train = np.where(Training_Set != 0)
test = np.where(Test_Set != 0)
operate = np.where(Operational_Set != 0)
# apply index to exclude zero values
# prepare the endmembers datasets
train_labels = Training_Set[train[0], train[1]]
test_labels = Test_Set[test[0], test[1]]
operate_label = Operational_Set[operate[0], operate[1]]
len(train_labels), len(test_labels), len(operate_label)
(6415, 3207, 3207)
# apply index to exclude zero values
# prepare the pixels datasets
train_hsi = HSI[train[0], train[1],:]
test_hsi = HSI[test[0], test[1],:]
operate_hsi = HSI[operate[0], operate[1]]
len(train_hsi), len(test_hsi), len(operate_hsi)
(6415, 3207, 3207)
# Indices (rows and columns) stored in 2 arrays, corresponding to non-zero labels
# in the train and test sets respectively.
non_zero_train = np.where(Training_Set!=0)
non_zero_test = np.where(Test_Set!=0)
training_labels = Training_Set[non_zero_train[0],non_zero_train[1]]
training_pixels = HSI[non_zero_train[0],non_zero_train[1],:]
test_labels = Test_Set[non_zero_test[0],non_zero_test[1]]
test_pixels = HSI[non_zero_test[0],non_zero_test[1],:]
# Plot labels
plt.imshow(labels, cmap = 'CMRmap') # plot the array
plt.colorbar() # plot the color bar
# Plot test set
plt.imshow(Test_Set, cmap = 'CMRmap') # plot the array
plt.colorbar() # plot the color bar
# Plot dev set
plt.imshow(Operational_Set, cmap = 'CMRmap') # plot the array
plt.colorbar() # plot the color bar
# Plot training set
plt.imshow(Training_Set, cmap = 'CMRmap') # plot the array
plt.colorbar() # plot the color bar
The task is to assign each pixel to the most appropriate class among the 9 known endmembers (classes).
The classification is performed with four pre-chosen classifiers: Naive Bayes, minimum Euclidean distance, k-nearest neighbor, and Bayes (quadratic discriminant analysis).
The first step is a 10-fold cross-validation for each classifier, for which the estimated validation error is reported.
The second step is the training of each classifier and the evaluation of its performance. This includes the computation of the confusion matrix and the success rate of the classifier.
Finally, I compare the results of the four classifiers.
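The success rate used throughout is simply the trace of the confusion matrix divided by its total sum; a tiny illustration on made-up labels:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 1, 2, 2, 3, 3])
y_pred = np.array([1, 2, 2, 2, 3, 3])   # one mistake out of six

cm = confusion_matrix(y_true, y_pred)
success_rate = cm.trace() / cm.sum()    # fraction of correctly classified samples
print(success_rate)                     # 5/6
```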
cv_naive_bayes = cross_val_score(GaussianNB(),
X = train_hsi,
y = train_labels,
cv = StratifiedKFold(n_splits = 10,
shuffle = True))
error_naive_bayes = 1 - cv_naive_bayes
print('The Naive Bayes classifier produces an estimated validation error with mean %.4f and standard deviation %.4f' %
(error_naive_bayes.mean(), error_naive_bayes.std()))
The Naive Bayes classifier produces an estimated validation error with mean 0.3394 and standard deviation 0.0178
# train model, predict for test labels
# calculate the confusion matrix and success rate
model_naive_bayes = GaussianNB().fit(train_hsi,
train_labels).predict(test_hsi)
naive_confusion_m = confusion_matrix(test_labels,
model_naive_bayes)
success_naive_bayes = naive_confusion_m.trace() / naive_confusion_m.sum()
print('The Naive Bayes classifier has a success rate of %.4f.' % success_naive_bayes)
The Naive Bayes classifier has a success rate of 0.6601.
# view the confusion matrix
pd.DataFrame(naive_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 131 | 0 | 37 | 0 | 0 | 0 | 80 | 13 | 0 |
| Trees | 0 | 326 | 4 | 6 | 0 | 17 | 0 | 0 | 0 |
| Asphalt | 25 | 2 | 127 | 0 | 0 | 13 | 70 | 299 | 0 |
| Bricks | 0 | 0 | 0 | 154 | 1 | 1 | 0 | 0 | 0 |
| Bitumen | 0 | 0 | 1 | 0 | 166 | 1 | 0 | 0 | 0 |
| Tiles | 0 | 312 | 2 | 55 | 32 | 363 | 0 | 0 | 0 |
| Shadows | 18 | 0 | 26 | 0 | 0 | 0 | 277 | 0 | 0 |
| Meadows | 2 | 1 | 67 | 0 | 0 | 1 | 2 | 388 | 0 |
| Bare Soil | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 185 |
folds_eucl = StratifiedKFold(n_splits = 10,
shuffle = True).split(train_hsi,
train_labels)
def average_value(labels, pixels):
    # Return the mean spectral signature (9 classes x 103 bands) of each class
    average = np.zeros((9,103))
    for label in range(1,10):
        class_index = np.where(labels == label)[0]   # pixels belonging to this class
        class_points = pixels[class_index]
        class_average = class_points.mean(axis=0)
        average[label - 1,:] = class_average
    return average
errors_euclidean = []
for fold in folds_eucl:
    temp_train_hsi = train_hsi[fold[0]]
    temp_train_labels = train_labels[fold[0]]
    temp_val_pixels = train_hsi[fold[1]]
    temp_val_labels = train_labels[fold[1]]
    # Average value for each endmember of the temporary training set
    temp_average = average_value(temp_train_labels,
                                 temp_train_hsi)
    # Distances between the temporary training-set averages and the validation pixels
    temp_distances = distance.cdist(temp_average,
                                    temp_val_pixels)
    # Predict based on the minimum distance
    temp_predictions = temp_distances.argmin(axis=0) + 1
    temp_confusion_m = confusion_matrix(temp_val_labels,
                                        temp_predictions)
    cv_euclidean = (temp_confusion_m.trace() / temp_confusion_m.sum())
    temp_error = 1 - cv_euclidean
    errors_euclidean.append(temp_error)
errors_euclidean_ar = np.array(errors_euclidean)
print('The minimum Euclidean distance classifier produces an estimated validation error with mean %.4f and standard deviation %.4f' %
      (errors_euclidean_ar.mean(), errors_euclidean_ar.std()))
The minimum Euclidean distance classifier produces an estimated validation error with mean 0.4338 and standard deviation 0.0170
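The per-class-mean plus `cdist`/`argmin` rule coded above is exactly the nearest-centroid classifier; as a sanity check, it agrees with scikit-learn's `NearestCentroid` (shown here on synthetic data, with illustrative names and shapes):

```python
import numpy as np
from scipy.spatial import distance
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(1)
X_train = rng.random((60, 5))
y_train = rng.integers(1, 4, size=60)   # labels 1..3
X_test = rng.random((20, 5))

# per-class mean signatures, as in average_value() above
classes = np.unique(y_train)
centroids = np.vstack([X_train[y_train == c].mean(axis=0) for c in classes])
pred_manual = classes[distance.cdist(centroids, X_test).argmin(axis=0)]

# scikit-learn's NearestCentroid implements the same decision rule
pred_sklearn = NearestCentroid().fit(X_train, y_train).predict(X_test)
print((pred_manual == pred_sklearn).all())
```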
# train model, predict for test labels
# calculate the confusion matrix and success rate
# Average spectral signature for each endmember, computed over the full training set
average = average_value(train_labels, train_hsi)
distances = distance.cdist(average, test_hsi)
minimum_distances_pred = distances.argmin(axis=0) + 1
euclidean_confusion_m = confusion_matrix(test_labels,
minimum_distances_pred)
success_euclidean = euclidean_confusion_m.trace() / euclidean_confusion_m.sum()
print('The Minimum Euclidean Distance classifier has a success rate of %.4f.'
% success_euclidean)
The Minimum Euclidean Distance classifier has a success rate of 0.5578.
# view the confusion matrix
pd.DataFrame(euclidean_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 152 | 0 | 46 | 0 | 0 | 0 | 61 | 2 | 0 |
| Trees | 1 | 188 | 0 | 5 | 0 | 156 | 0 | 3 | 0 |
| Asphalt | 66 | 2 | 198 | 0 | 0 | 1 | 39 | 230 | 0 |
| Bricks | 0 | 0 | 0 | 154 | 0 | 0 | 0 | 0 | 2 |
| Bitumen | 0 | 0 | 0 | 0 | 128 | 0 | 0 | 40 | 0 |
| Tiles | 11 | 317 | 0 | 12 | 16 | 240 | 0 | 168 | 0 |
| Shadows | 61 | 0 | 23 | 0 | 0 | 0 | 237 | 0 | 0 |
| Meadows | 2 | 1 | 145 | 0 | 0 | 1 | 7 | 305 | 0 |
| Bare Soil | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 187 |
cv_knn = cross_val_score(KNeighborsClassifier(n_neighbors = 7),
X = train_hsi,
y = train_labels,
cv = StratifiedKFold(n_splits = 10,
shuffle = True))
error_kk = 1 - cv_knn
print('The k-nearest neighbor classifier produces an estimated validation error with mean %.4f and standard deviation %.4f' %
(error_kk.mean(), error_kk.std()))
The k-nearest neighbor classifier produces an estimated validation error with mean 0.1200 and standard deviation 0.0113
# train model, predict for test labels
# calculate the confusion matrix and success rate
neigh_classifier = KNeighborsClassifier(n_neighbors = 7)
model_kneigh = neigh_classifier.fit(train_hsi,
train_labels).predict(test_hsi)
knn_confusion_m = confusion_matrix(test_labels, model_kneigh)
success_knn = knn_confusion_m.trace() / knn_confusion_m.sum()
print('The k-nearest neighbor classifier has a success rate of %.4f.'
% success_knn)
The k-nearest neighbor classifier has a success rate of 0.8884.
# view the confusion matrix
pd.DataFrame(knn_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 192 | 0 | 10 | 0 | 0 | 0 | 26 | 33 | 0 |
| Trees | 0 | 325 | 0 | 1 | 0 | 27 | 0 | 0 | 0 |
| Asphalt | 10 | 2 | 454 | 0 | 0 | 4 | 1 | 65 | 0 |
| Bricks | 0 | 0 | 0 | 155 | 0 | 1 | 0 | 0 | 0 |
| Bitumen | 0 | 0 | 1 | 0 | 166 | 0 | 0 | 1 | 0 |
| Tiles | 0 | 59 | 1 | 0 | 1 | 701 | 0 | 2 | 0 |
| Shadows | 11 | 0 | 3 | 0 | 0 | 0 | 304 | 3 | 0 |
| Meadows | 8 | 2 | 84 | 0 | 0 | 0 | 2 | 365 | 0 |
| Bare Soil | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 187 |
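The choice of k = 7 is taken as given here; if one wanted to tune it, a small cross-validated sweep would do. A sketch on synthetic data (the dataset and candidate values of k are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# mean CV accuracy for each candidate k
scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=cv).mean()
          for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```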
cv_bayes = cross_val_score(QuadraticDiscriminantAnalysis(),
X = train_hsi,
y = train_labels,
cv = StratifiedKFold(n_splits = 10,
shuffle = True))
error_bayes = 1 - cv_bayes
print('The Bayes classifier produces an estimated validation error with mean %.4f and standard deviation %.4f' %
(error_bayes.mean(), error_bayes.std()))
The Bayes classifier produces an estimated validation error with mean 0.1217 and standard deviation 0.0120
# train model, predict for test labels
# calculate the confusion matrix and success rate
module_bayes = QuadraticDiscriminantAnalysis().fit(train_hsi,
train_labels)
model_bayes = module_bayes.predict(test_hsi)
bayes_confusion_m = confusion_matrix(test_labels,
model_bayes)
success_bayes = bayes_confusion_m.trace() / bayes_confusion_m.sum()
print('The Bayes classifier has a success rate of %.4f.' % success_bayes)
The Bayes classifier has a success rate of 0.8843.
# view the confusion matrix
pd.DataFrame(bayes_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 155 | 0 | 46 | 0 | 0 | 2 | 10 | 48 | 0 |
| Trees | 0 | 328 | 0 | 3 | 0 | 22 | 0 | 0 | 0 |
| Asphalt | 10 | 1 | 430 | 0 | 0 | 0 | 0 | 95 | 0 |
| Bricks | 0 | 0 | 0 | 154 | 0 | 2 | 0 | 0 | 0 |
| Bitumen | 0 | 0 | 0 | 0 | 168 | 0 | 0 | 0 | 0 |
| Tiles | 0 | 1 | 0 | 1 | 0 | 762 | 0 | 0 | 0 |
| Shadows | 14 | 0 | 10 | 0 | 0 | 2 | 291 | 4 | 0 |
| Meadows | 19 | 0 | 73 | 0 | 0 | 2 | 0 | 367 | 0 |
| Bare Soil | 3 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 181 |
# compare the estimated validation errors of the four classifiers
results = {'Naive Bayes': error_naive_bayes,
           'Minimum Euclidean distance': errors_euclidean_ar,
           'K-nearest neighbor': error_kk,
           'Bayesian': error_bayes}
for name, err in results.items():
    print('%-28s mean = %.4f, standard deviation = %.4f' %
          (name + ':', err.mean(), err.std()))
Naive Bayes:                 mean = 0.3394, standard deviation = 0.0178
Minimum Euclidean distance:  mean = 0.4338, standard deviation = 0.0170
K-nearest neighbor:          mean = 0.1200, standard deviation = 0.0113
Bayesian:                    mean = 0.1217, standard deviation = 0.0120
Going by these numbers, the k-nearest neighbor classifier attains the smallest mean validation error as well as the smallest standard deviation, with the Bayesian classifier a very close second. Both clearly outperform the Naive Bayes and minimum Euclidean distance classifiers.
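The same comparison reads more easily as a table. A sketch, using synthetic stand-ins for the four per-fold error arrays (the CV folds above are shuffled without a fixed seed, so exact values vary between runs):

```python
import numpy as np
import pandas as pd

# synthetic stand-ins for the four per-fold validation-error arrays computed above
rng = np.random.default_rng(0)
errors = {
    'Naive Bayes': rng.normal(0.34, 0.02, 10),
    'Minimum Euclidean distance': rng.normal(0.43, 0.02, 10),
    'K-nearest neighbor': rng.normal(0.12, 0.01, 10),
    'Bayesian': rng.normal(0.12, 0.01, 10),
}
summary = pd.DataFrame({name: {'mean': e.mean(), 'std': e.std()}
                        for name, e in errors.items()}).T
print(summary)
```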
# view the confusion matrix
pd.DataFrame(naive_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 131 | 0 | 37 | 0 | 0 | 0 | 80 | 13 | 0 |
| Trees | 0 | 326 | 4 | 6 | 0 | 17 | 0 | 0 | 0 |
| Asphalt | 25 | 2 | 127 | 0 | 0 | 13 | 70 | 299 | 0 |
| Bricks | 0 | 0 | 0 | 154 | 1 | 1 | 0 | 0 | 0 |
| Bitumen | 0 | 0 | 1 | 0 | 166 | 1 | 0 | 0 | 0 |
| Tiles | 0 | 312 | 2 | 55 | 32 | 363 | 0 | 0 | 0 |
| Shadows | 18 | 0 | 26 | 0 | 0 | 0 | 277 | 0 | 0 |
| Meadows | 2 | 1 | 67 | 0 | 0 | 1 | 2 | 388 | 0 |
| Bare Soil | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 185 |
print('The Naive Bayes classifier has a success rate of %.4f.' % success_naive_bayes)
The Naive Bayes classifier has a success rate of 0.6601.
The Naive Bayes classifier performs moderately, with a success rate of 66 %. Land-cover types such as trees, bricks, bitumen, shadows and bare soil are identified very well, while other types are often misidentified, e.g. asphalt is frequently labeled as meadows, and tiles as trees.
# view the confusion matrix
pd.DataFrame(euclidean_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 152 | 0 | 46 | 0 | 0 | 0 | 61 | 2 | 0 |
| Trees | 1 | 188 | 0 | 5 | 0 | 156 | 0 | 3 | 0 |
| Asphalt | 66 | 2 | 198 | 0 | 0 | 1 | 39 | 230 | 0 |
| Bricks | 0 | 0 | 0 | 154 | 0 | 0 | 0 | 0 | 2 |
| Bitumen | 0 | 0 | 0 | 0 | 128 | 0 | 0 | 40 | 0 |
| Tiles | 11 | 317 | 0 | 12 | 16 | 240 | 0 | 168 | 0 |
| Shadows | 61 | 0 | 23 | 0 | 0 | 0 | 237 | 0 | 0 |
| Meadows | 2 | 1 | 145 | 0 | 0 | 1 | 7 | 305 | 0 |
| Bare Soil | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 187 |
print('The Minimum Euclidean Distance classifier has a success rate of %.4f.'
% success_euclidean)
The Minimum Euclidean Distance classifier has a success rate of 0.5578.
The Minimum Euclidean Distance classifier performs just a bit above 50 %. Most land-cover types are frequently misidentified; bare soil is the clearest exception.
# view the confusion matrix
pd.DataFrame(knn_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 192 | 0 | 10 | 0 | 0 | 0 | 26 | 33 | 0 |
| Trees | 0 | 325 | 0 | 1 | 0 | 27 | 0 | 0 | 0 |
| Asphalt | 10 | 2 | 454 | 0 | 0 | 4 | 1 | 65 | 0 |
| Bricks | 0 | 0 | 0 | 155 | 0 | 1 | 0 | 0 | 0 |
| Bitumen | 0 | 0 | 1 | 0 | 166 | 0 | 0 | 1 | 0 |
| Tiles | 0 | 59 | 1 | 0 | 1 | 701 | 0 | 2 | 0 |
| Shadows | 11 | 0 | 3 | 0 | 0 | 0 | 304 | 3 | 0 |
| Meadows | 8 | 2 | 84 | 0 | 0 | 0 | 2 | 365 | 0 |
| Bare Soil | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 187 |
print('The k-nearest neighbor classifier has a success rate of %.4f.'
% success_knn)
The k-nearest neighbor classifier has a success rate of 0.8884.
The k-nearest neighbor classifier performs very well, with a success rate close to 90 %. Bare soil is correctly identified in all cases, and most other land-cover types are also correctly assigned. The main outlier is the Meadows class, which is often confused with Asphalt.
# view the confusion matrix
pd.DataFrame(bayes_confusion_m, index = [materials_masked[i] for i in range(1, 10)],
columns = [materials_masked[i] for i in range(1, 10)])
| Water | Trees | Asphalt | Bricks | Bitumen | Tiles | Shadows | Meadows | Bare Soil | |
|---|---|---|---|---|---|---|---|---|---|
| Water | 155 | 0 | 46 | 0 | 0 | 2 | 10 | 48 | 0 |
| Trees | 0 | 328 | 0 | 3 | 0 | 22 | 0 | 0 | 0 |
| Asphalt | 10 | 1 | 430 | 0 | 0 | 0 | 0 | 95 | 0 |
| Bricks | 0 | 0 | 0 | 154 | 0 | 2 | 0 | 0 | 0 |
| Bitumen | 0 | 0 | 0 | 0 | 168 | 0 | 0 | 0 | 0 |
| Tiles | 0 | 1 | 0 | 1 | 0 | 762 | 0 | 0 | 0 |
| Shadows | 14 | 0 | 10 | 0 | 0 | 2 | 291 | 4 | 0 |
| Meadows | 19 | 0 | 73 | 0 | 0 | 2 | 0 | 367 | 0 |
| Bare Soil | 3 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 181 |
print('The Bayes classifier has a success rate of %.4f.' % success_bayes)
The Bayes classifier has a success rate of 0.8843.
The Bayes classifier also performs very well, with a success rate close to 90 %. As before, bare soil is correctly identified in nearly all cases, and the other land-cover types are mostly assigned correctly. The outlier is again the Meadows class, which is often confused with Asphalt.
The Bayes classifier has the visually most diagonal confusion matrix of the four, with most off-diagonal entries zero or very small.
The k-nearest neighbor classifier follows closely, with a few more non-zero off-diagonal entries (although its overall success rate is marginally higher).
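How "diagonal" a confusion matrix is can be quantified as its off-diagonal mass, i.e. one minus the success rate. A small helper (the example matrix is illustrative):

```python
import numpy as np

def off_diagonal_fraction(cm):
    """Fraction of samples landing off the confusion-matrix diagonal (= 1 - success rate)."""
    cm = np.asarray(cm, dtype=float)
    return 1.0 - cm.trace() / cm.sum()

cm = np.array([[50, 2],
               [3, 45]])
print(off_diagonal_fraction(cm))   # 5 / 100 = 0.05
```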
# report statistical metrics for each classifier
naive_bayes_report = classification_report(test_labels,model_naive_bayes,
output_dict = True)
eucl_report = classification_report(test_labels,minimum_distances_pred,
output_dict = True)
knn_report = classification_report(test_labels,model_kneigh,
output_dict = True)
bayes_report = classification_report(test_labels,model_bayes,
output_dict = True)
# make dataframes for the reports
df_naive_report = pd.DataFrame(data = naive_bayes_report).transpose()
df_naive_report.columns = pd.MultiIndex.from_product([['Naive Bayes classifier'],df_naive_report.columns])
df_min_euc_report = pd.DataFrame(data = eucl_report).transpose()
df_min_euc_report.columns = pd.MultiIndex.from_product([['Minimum Euclidean distance classifier'],df_min_euc_report.columns])
df_knn_report = pd.DataFrame(data = knn_report).transpose()
df_knn_report.columns = pd.MultiIndex.from_product([['k-nearest neighbor classifier'],df_knn_report.columns])
df_bayes_report = pd.DataFrame(data = bayes_report).transpose()
df_bayes_report.columns = pd.MultiIndex.from_product([['Bayes classifier'],df_bayes_report.columns])
# combine dataframes
pd.concat([df_naive_report.round(3), df_min_euc_report.round(3),
df_knn_report.round(3), df_bayes_report.round(3)], axis = 1)
| Naive Bayes classifier | Minimum Euclidean distance classifier | k-nearest neighbor classifier | Bayes classifier | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| precision | recall | f1-score | support | precision | recall | f1-score | support | precision | recall | f1-score | support | precision | recall | f1-score | support | |
| 1 | 0.744 | 0.502 | 0.600 | 261.00 | 0.519 | 0.582 | 0.549 | 261.000 | 0.869 | 0.736 | 0.797 | 261.000 | 0.771 | 0.594 | 0.671 | 261.000 |
| 2 | 0.509 | 0.924 | 0.656 | 353.00 | 0.370 | 0.533 | 0.437 | 353.000 | 0.838 | 0.921 | 0.877 | 353.000 | 0.994 | 0.929 | 0.960 | 353.000 |
| 3 | 0.481 | 0.237 | 0.318 | 536.00 | 0.481 | 0.369 | 0.418 | 536.000 | 0.821 | 0.847 | 0.834 | 536.000 | 0.769 | 0.802 | 0.785 | 536.000 |
| 4 | 0.710 | 0.987 | 0.826 | 156.00 | 0.901 | 0.987 | 0.942 | 156.000 | 0.994 | 0.994 | 0.994 | 156.000 | 0.969 | 0.987 | 0.978 | 156.000 |
| 5 | 0.834 | 0.988 | 0.905 | 168.00 | 0.889 | 0.762 | 0.821 | 168.000 | 0.994 | 0.988 | 0.991 | 168.000 | 0.988 | 1.000 | 0.994 | 168.000 |
| 6 | 0.917 | 0.475 | 0.626 | 764.00 | 0.603 | 0.314 | 0.413 | 764.000 | 0.956 | 0.918 | 0.937 | 764.000 | 0.962 | 0.997 | 0.979 | 764.000 |
| 7 | 0.646 | 0.863 | 0.739 | 321.00 | 0.689 | 0.738 | 0.713 | 321.000 | 0.913 | 0.947 | 0.930 | 321.000 | 0.967 | 0.907 | 0.936 | 321.000 |
| 8 | 0.554 | 0.842 | 0.668 | 461.00 | 0.408 | 0.662 | 0.505 | 461.000 | 0.778 | 0.792 | 0.785 | 461.000 | 0.714 | 0.796 | 0.753 | 461.000 |
| 9 | 1.000 | 0.989 | 0.995 | 187.00 | 0.989 | 1.000 | 0.995 | 187.000 | 1.000 | 1.000 | 1.000 | 187.000 | 1.000 | 0.968 | 0.984 | 187.000 |
| accuracy | 0.660 | 0.660 | 0.660 | 0.66 | 0.558 | 0.558 | 0.558 | 0.558 | 0.888 | 0.888 | 0.888 | 0.888 | 0.884 | 0.884 | 0.884 | 0.884 |
| macro avg | 0.710 | 0.756 | 0.703 | 3207.00 | 0.650 | 0.661 | 0.643 | 3207.000 | 0.907 | 0.905 | 0.905 | 3207.000 | 0.904 | 0.887 | 0.893 | 3207.000 |
| weighted avg | 0.696 | 0.660 | 0.639 | 3207.00 | 0.583 | 0.558 | 0.552 | 3207.000 | 0.890 | 0.888 | 0.888 | 3207.000 | 0.887 | 0.884 | 0.884 | 3207.000 |
report_all = pd.concat([df_naive_report.round(3), df_min_euc_report.round(3),
df_knn_report.round(3), df_bayes_report.round(3)], axis = 1)
report_all.iloc[0:9,:].describe()
| Naive Bayes classifier | Minimum Euclidean distance classifier | k-nearest neighbor classifier | Bayes classifier | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| precision | recall | f1-score | support | precision | recall | f1-score | support | precision | recall | f1-score | support | precision | recall | f1-score | support | |
| count | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 | 9.000000 |
| mean | 0.710556 | 0.756333 | 0.703667 | 356.333333 | 0.649889 | 0.660778 | 0.643667 | 356.333333 | 0.907000 | 0.904778 | 0.905000 | 356.333333 | 0.903778 | 0.886667 | 0.893333 | 356.333333 |
| std | 0.181918 | 0.278698 | 0.197068 | 201.262267 | 0.229526 | 0.241253 | 0.230143 | 201.262267 | 0.084013 | 0.094020 | 0.085059 | 201.262267 | 0.116159 | 0.134605 | 0.122468 | 201.262267 |
| min | 0.481000 | 0.237000 | 0.318000 | 156.000000 | 0.370000 | 0.314000 | 0.413000 | 156.000000 | 0.778000 | 0.736000 | 0.785000 | 156.000000 | 0.714000 | 0.594000 | 0.671000 | 156.000000 |
| 25% | 0.554000 | 0.502000 | 0.626000 | 187.000000 | 0.481000 | 0.533000 | 0.437000 | 187.000000 | 0.838000 | 0.847000 | 0.834000 | 187.000000 | 0.771000 | 0.802000 | 0.785000 | 187.000000 |
| 50% | 0.710000 | 0.863000 | 0.668000 | 321.000000 | 0.603000 | 0.662000 | 0.549000 | 321.000000 | 0.913000 | 0.921000 | 0.930000 | 321.000000 | 0.967000 | 0.929000 | 0.960000 | 321.000000 |
| 75% | 0.834000 | 0.987000 | 0.826000 | 461.000000 | 0.889000 | 0.762000 | 0.821000 | 461.000000 | 0.994000 | 0.988000 | 0.991000 | 461.000000 | 0.988000 | 0.987000 | 0.979000 | 461.000000 |
| max | 1.000000 | 0.989000 | 0.995000 | 764.000000 | 0.989000 | 1.000000 | 0.995000 | 764.000000 | 1.000000 | 1.000000 | 1.000000 | 764.000000 | 1.000000 | 1.000000 | 0.994000 | 764.000000 |
Reviewing the report of all four classifiers with regard to all classes, I notice that:
report_all.iloc[9:12,:]
| Naive Bayes classifier | Minimum Euclidean distance classifier | k-nearest neighbor classifier | Bayes classifier | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| precision | recall | f1-score | support | precision | recall | f1-score | support | precision | recall | f1-score | support | precision | recall | f1-score | support | |
| accuracy | 0.660 | 0.660 | 0.660 | 0.66 | 0.558 | 0.558 | 0.558 | 0.558 | 0.888 | 0.888 | 0.888 | 0.888 | 0.884 | 0.884 | 0.884 | 0.884 |
| macro avg | 0.710 | 0.756 | 0.703 | 3207.00 | 0.650 | 0.661 | 0.643 | 3207.000 | 0.907 | 0.905 | 0.905 | 3207.000 | 0.904 | 0.887 | 0.893 | 3207.000 |
| weighted avg | 0.696 | 0.660 | 0.639 | 3207.00 | 0.583 | 0.558 | 0.552 | 3207.000 | 0.890 | 0.888 | 0.888 | 3207.000 | 0.887 | 0.884 | 0.884 | 3207.000 |
Reviewing the report of all four classifiers with regard to the overall performance, I notice that:
In this section I test the trained models on the full mixed-signature HSI cube, predicting a label for each of its 60,000 (300 × 200) mixed spectral signatures.
I plot all abundance maps together and compare the results per material.
Naive Bayes classifier
# predict labels for HSI dataset
HSI_reshp = np.reshape(HSI, (300*200, 103)) # reshape HSI
# predict labels
naive_predict = GaussianNB().fit(train_hsi,
train_labels).predict(HSI_reshp)
# reshape prediction
naive_predict_reshp = np.reshape(naive_predict, (300,200))
# mask zero values
naive_predict_masked = np.ma.masked_equal(naive_predict_reshp, 0)
plt.imshow(naive_predict_masked, cmap = cmap_masked)
plt.legend(handles=patches_masked, bbox_to_anchor=(1.3, 1), loc=2, borderaxespad=0. )
plt.colorbar() # plot the color bar
Minimum Euclidean distance classifier
# predict labels for HSI dataset
HSI_reshp = np.reshape(HSI, (300*200, 103)) # reshape HSI
# predict labels
distances_euclidean = distance.cdist(average, HSI_reshp)
minimum_distances_eucl_pred = distances_euclidean.argmin(axis=0) + 1
# reshape prediction
eucl_predict_reshp = np.reshape(minimum_distances_eucl_pred, (300,200))
# mask zero values
eucl_predict_reshp_masked = np.ma.masked_equal(eucl_predict_reshp, 0)
plt.imshow(eucl_predict_reshp_masked, cmap = cmap_masked)
plt.legend(handles=patches_masked, bbox_to_anchor=(1.3, 1), loc=2, borderaxespad=0. )
plt.colorbar() # plot the color bar
k-nearest neighbor classifier
# predict labels for HSI dataset
HSI_reshp = np.reshape(HSI, (300*200, 103)) # reshape HSI
# predict labels
neigh_predict = neigh_classifier.fit(train_hsi,train_labels).predict(HSI_reshp)
# reshape prediction
neigh_predict_reshp = np.reshape(neigh_predict, (300,200))
# mask zero values
neigh_predict_masked = np.ma.masked_equal(neigh_predict_reshp, 0)
plt.imshow(neigh_predict_masked, cmap = cmap_masked)
plt.legend(handles=patches_masked, bbox_to_anchor=(1.3, 1), loc=2, borderaxespad=0. )
plt.colorbar() # plot the color bar
Bayes classifier
# predict labels for HSI dataset
HSI_reshp = np.reshape(HSI, (300*200, 103)) # reshape HSI
# predict labels
bayes_predict = module_bayes.predict(HSI_reshp)
# reshape prediction
bayes_predict_reshp = np.reshape(bayes_predict, (300,200))
# mask zero values
bayes_predict_reshp_masked = np.ma.masked_equal(bayes_predict_reshp, 0)
plt.imshow(bayes_predict_reshp_masked, cmap = cmap_masked)
plt.legend(handles=patches_masked, bbox_to_anchor=(1.3, 1), loc=2, borderaxespad=0. )
plt.colorbar() # plot the color bar
# set a function to plot each label in order to compare on a
# one-by-one basis
def function_to_plot_classifiers(endmember):
    fig, [ax1, ax2, ax3, ax4, ax5] = plt.subplots(figsize = (30,20), nrows = 1, ncols = 5)
    divider5 = make_axes_locatable(ax5)
    cax5 = divider5.append_axes('right', size='5%', pad=0.05)
    im1 = ax1.imshow(np.ma.masked_where(naive_predict_masked != (endmember + 1),
                                        naive_predict_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im2 = ax2.imshow(np.ma.masked_where(eucl_predict_reshp_masked != (endmember + 1),
                                        eucl_predict_reshp_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im3 = ax3.imshow(np.ma.masked_where(neigh_predict_masked != (endmember + 1),
                                        neigh_predict_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im4 = ax4.imshow(np.ma.masked_where(bayes_predict_reshp_masked != (endmember + 1),
                                        bayes_predict_reshp_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im5 = ax5.imshow(np.ma.masked_where(labels != (endmember + 1),
                                        labels),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    fig.colorbar(im5, cax=cax5, orientation='vertical')
    #plt.subplots_adjust(right = 1)
    ax1.set_title('Naive Bayes\nclassifier', fontsize = 25)
    ax2.set_title('Minimum Euclidean\ndistance classifier', fontsize = 25)
    ax3.set_title('k-nearest neighbor\nclassifier', fontsize = 25)
    ax4.set_title('Bayes classifier', fontsize = 25)
    ax5.set_title('Ground truth for\nendmember %d' % ((endmember+1)), fontsize = 25)
    # ax6.legend(handles=patches_masked,
    #            bbox_to_anchor=(1.3, 1),
    #            loc=2, borderaxespad=0.)
    fig.suptitle('Reconstructed abundance maps vs ground truth (Label %s) for %s' %
                 ((endmember+1), materials_masked[endmember+1]), fontsize = 30, y= 0.75)
function_to_plot_classifiers(0)
Comparing the ground truth for water against the abundance maps of the classifiers I notice the following:
function_to_plot_classifiers(1)
Comparing the ground truth for trees against the abundance maps of the classifiers I notice the following:
function_to_plot_classifiers(2)
Comparing the ground truth for asphalt against the abundance maps of the classifiers I notice the following:
function_to_plot_classifiers(3)
Comparing the ground truth for bricks against the abundance maps of the classifiers I notice the following:
function_to_plot_classifiers(4)
Comparing the ground truth for bitumen against the abundance maps of the classifiers I notice the following:
function_to_plot_classifiers(5)
Comparing the ground truth for tiles against the abundance maps of the classifiers I notice the following:
In all four cases, the result in identifying the tiles is not satisfactory.
function_to_plot_classifiers(6)
Comparing the ground truth for shadows against the abundance maps of the classifiers I notice the following:
function_to_plot_classifiers(7)
Comparing the ground truth for meadows against the abundance maps of the classifiers I notice the following:
function_to_plot_classifiers(8)
Comparing the ground truth for bare soil against the abundance maps of the classifiers I notice the following:
Summing up, taking into account the estimated validation errors, the confusion matrices and the abundance maps of the four classifiers, I conclude that the "winner" is the Bayes classifier, followed by the k-nearest neighbor classifier.
I plot the abundance maps of the regressors and classifiers, against the ground truth and discuss the results.
# set a function to plot each label in order to compare on a
# one-by-one basis
def plot_reg_classifiers(endmember):
    fig, [[ax1, ax2, ax3, ax4, ax5], [ax6, ax7, ax8, ax9, ax10]] = plt.subplots(figsize = (30,20),
                                                                                nrows = 2,
                                                                                ncols = 5)
    divider1 = make_axes_locatable(ax1)
    divider2 = make_axes_locatable(ax2)
    divider3 = make_axes_locatable(ax3)
    divider4 = make_axes_locatable(ax4)
    divider5 = make_axes_locatable(ax5)
    divider6 = make_axes_locatable(ax6)
    divider7 = make_axes_locatable(ax7)
    divider8 = make_axes_locatable(ax8)
    divider9 = make_axes_locatable(ax9)
    divider10 = make_axes_locatable(ax10)
    cax1 = divider1.append_axes('right', size='5%', pad=0.05)
    cax2 = divider2.append_axes('right', size='5%', pad=0.05)
    cax3 = divider3.append_axes('right', size='5%', pad=0.05)
    cax4 = divider4.append_axes('right', size='5%', pad=0.05)
    cax5 = divider5.append_axes('right', size='5%', pad=0.05)
    cax6 = divider6.append_axes('right', size='5%', pad=0.05)
    cax7 = divider7.append_axes('right', size='5%', pad=0.05)
    cax8 = divider8.append_axes('right', size='5%', pad=0.05)
    cax9 = divider9.append_axes('right', size='5%', pad=0.05)
    cax10 = divider10.append_axes('right', size='5%', pad=0.05)
    im1 = ax1.imshow(thetas_ar_resh[:,:,endmember],
                     cmap = cmaps_shuffle[endmember+1])
    im2 = ax2.imshow(thetas_sum_one_ar_resh[:,:,endmember],
                     cmap = cmaps_shuffle[endmember+1])
    im3 = ax3.imshow(thetas_non_neg_ar_resh[:,:,endmember],
                     cmap = cmaps_shuffle[endmember+1])
    im4 = ax4.imshow(thetas_sum_one_non_neg_ar_resh[:,:,endmember],
                     cmap = cmaps_shuffle[endmember+1])
    im5 = ax5.imshow(lasso_abund_map[:,:,endmember],
                     cmap = cmaps_shuffle[endmember+1])
    im6 = ax6.imshow(np.ma.masked_where(naive_predict_masked != (endmember + 1),
                                        naive_predict_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im7 = ax7.imshow(np.ma.masked_where(eucl_predict_reshp_masked != (endmember + 1),
                                        eucl_predict_reshp_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im8 = ax8.imshow(np.ma.masked_where(neigh_predict_masked != (endmember + 1),
                                        neigh_predict_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im9 = ax9.imshow(np.ma.masked_where(bayes_predict_reshp_masked != (endmember + 1),
                                        bayes_predict_reshp_masked),
                     cmap = ListedColormap([color_dict[endmember + 1]]))
    im10 = ax10.imshow(np.ma.masked_where(labels != (endmember + 1),
                                          labels),
                       cmap = ListedColormap([color_dict[endmember + 1]]))
    fig.colorbar(im10, cax=cax10, orientation='vertical')
    ax1.set_title(r'Linear Reg.: No constraint', fontsize = 25)
    ax2.set_title('Sum-to-one', fontsize = 25)
    ax3.set_title('Non-negativity', fontsize = 25)
    ax4.set_title('Sum-to-one\n non-negativity', fontsize = 25)
    ax5.set_title('LASSO', fontsize = 25)
    ax6.set_title('Naive Bayes\nclassifier', fontsize = 25)
    ax7.set_title('Minimum Euclidean\ndistance classifier', fontsize = 25)
    ax8.set_title('k-nearest neighbor\nclassifier', fontsize = 25)
    ax9.set_title('Bayes classifier', fontsize = 25)
    ax10.set_title('Ground truth \nfor endmember %d' % ((endmember+1)), fontsize = 25)
    fig.suptitle('Reconstructed abundance maps for regressors and classifiers\nvs ground truth (Label %s) for %s' %
                 ((endmember+1), materials_masked[endmember+1]), fontsize = 30, y= 0.95)
plot_reg_classifiers(0)
Comparing the ground truth for water against the abundance maps of the regressors and classifiers I notice the following:
A combination of these results could lead to an optimal identification of the course and width of the torrent. In this case I could mask the abundance map of the Bayes classifier with the zero values of the linear regression's abundance map, and obtain the optimum result.
plot_reg_classifiers(1)
Comparing the ground truth for trees against the abundance maps of the regressors and classifiers I notice the following:
A combination of these results could best get rid of the misfits at the north-west of the campus. In this case I could mask the abundance map of the non-negativity-constrained linear regression with the zero values of the k-nearest neighbor classifier's abundance map, and obtain the optimum result.
plot_reg_classifiers(2)
Comparing the ground truth for roads (asphalt) against the abundance maps of the regressors and classifiers I notice the following:
A combination of these results could best get rid of the misfits at the north-west of the campus. In this case I could mask the abundance map of the non-negativity-constrained linear regression with the zero values of the k-nearest neighbor classifier's abundance map, and obtain the optimum result.
plot_reg_classifiers(3)
Comparing the ground truth for bricks against the abundance maps of the regressors and classifiers I notice the following:
All other results have major errors.
plot_reg_classifiers(4)
Comparing the ground truth for bitumen against the abundance maps of the regressors and classifiers I notice the following:
plot_reg_classifiers(5)
Comparing the ground truth for tiles against the abundance maps of the regressors and classifiers I notice the following:
All other results have major errors.
plot_reg_classifiers(6)
Comparing the ground truth for shadows against the abundance maps of the regressors and classifiers I notice the following:
plot_reg_classifiers(7)
Comparing the ground truth for meadows against the abundance maps of the regressors and classifiers I notice the following:
plot_reg_classifiers(8)
Comparing the ground truth for the bare soil against the abundance maps of the regressors and classifiers I notice the following: